Overview

Dataset statistics

Number of variables: 34
Number of observations: 81412
Missing cells: 326565
Missing cells (%): 11.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.9 MiB
Average record size in memory: 127.1 B
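
The missing-cell percentage can be recomputed from the counts above: the dataset holds 34 × 81412 cells, of which 326565 are missing. A minimal sketch using only figures from this block (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 34
n_observations = 81412
missing_cells = 326565

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```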

Variable types

Numeric: 13
Categorical: 14
Boolean: 7

Alerts

insulin is highly correlated with change and 1 other field (High correlation)
change is highly correlated with insulin and 1 other field (High correlation)
diabetesMed is highly correlated with insulin and 1 other field (High correlation)
race has 1827 (2.2%) missing values Missing
age has 2336 (2.9%) missing values Missing
weight has 78913 (96.9%) missing values Missing
admission_type_code has 9366 (11.5%) missing values Missing
discharge_disposition_code has 4276 (5.3%) missing values Missing
admission_source_code has 5641 (6.9%) missing values Missing
payer_code has 32278 (39.6%) missing values Missing
medical_specialty has 40020 (49.2%) missing values Missing
num_lab_procedures has 1493 (1.8%) missing values Missing
num_medications has 2678 (3.3%) missing values Missing
diag_2 has 1620 (2.0%) missing values Missing
diag_3 has 1133 (1.4%) missing values Missing
max_glu_serum has 77159 (94.8%) missing values Missing
A1Cresult has 67807 (83.3%) missing values Missing
admission_id is uniformly distributed Uniform
admission_id has unique values Unique
num_procedures has 37355 (45.9%) zeros Zeros
number_outpatient has 67984 (83.5%) zeros Zeros
number_emergency has 72350 (88.9%) zeros Zeros
number_inpatient has 53995 (66.3%) zeros Zeros
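
The missing-value alerts above can be cross-checked against the number of observations. The sketch below recomputes each percentage and flags variables that are mostly empty; the 50% cut-off is a hypothetical choice for illustration, not a pandas-profiling setting.

```python
n_observations = 81412

# Missing-value counts copied from the alerts above.
missing_alerts = {
    "race": 1827,
    "age": 2336,
    "weight": 78913,
    "admission_type_code": 9366,
    "discharge_disposition_code": 4276,
    "admission_source_code": 5641,
    "payer_code": 32278,
    "medical_specialty": 40020,
    "num_lab_procedures": 1493,
    "num_medications": 2678,
    "diag_2": 1620,
    "diag_3": 1133,
    "max_glu_serum": 77159,
    "A1Cresult": 67807,
}

# Share of missing values per variable, in percent.
missing_pct = {
    name: 100 * count / n_observations
    for name, count in missing_alerts.items()
}

# Hypothetical threshold: flag variables where more than half the rows are empty.
threshold = 50.0
mostly_missing = sorted(
    name for name, pct in missing_pct.items() if pct > threshold
)

print(mostly_missing)
print(sum(missing_alerts.values()))
```

The 14 alerted counts sum to 326547. The remaining 18 of the 326565 missing cells reported in the overview match gender (2 missing) and diag_1 (16 missing), whose summaries list missing values without an alert.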

Reproduction

Analysis started: 2022-01-29 17:36:57.587297
Analysis finished: 2022-01-29 17:37:37.221163
Duration: 39.63 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
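
The reported duration is the difference between the two timestamps above. A small sketch with the standard library, using only values from this block:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-01-29 17:36:57.587297")
finished = datetime.fromisoformat("2022-01-29 17:37:37.221163")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")
```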

Variables

admission_id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct81412
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean40705.5
Minimum0
Maximum81411
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum0
5-th percentile4070.55
Q120352.75
median40705.5
Q361058.25
95-th percentile77340.45
Maximum81411
Range81411
Interquartile range (IQR)40705.5

Descriptive statistics

Standard deviation23501.76439
Coefficient of variation (CV)0.5773609069
Kurtosis-1.2
Mean40705.5
Median Absolute Deviation (MAD)20353
Skewness0
Sum3313916166
Variance552332929.7
MonotonicityStrictly increasing
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 1 | < 0.1%
79180 | 1 | < 0.1%
19811 | 1 | < 0.1%
17762 | 1 | < 0.1%
23905 | 1 | < 0.1%
21856 | 1 | < 0.1%
77135 | 1 | < 0.1%
75086 | 1 | < 0.1%
81229 | 1 | < 0.1%
68939 | 1 | < 0.1%
Other values (81402) | 81402 | > 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
81411 | 1 | < 0.1%
81410 | 1 | < 0.1%
81409 | 1 | < 0.1%
81408 | 1 | < 0.1%
81407 | 1 | < 0.1%
81406 | 1 | < 0.1%
81405 | 1 | < 0.1%
81404 | 1 | < 0.1%
81403 | 1 | < 0.1%
81402 | 1 | < 0.1%
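
Because admission_id is unique, strictly increasing, and runs from 0 to 81411, its summary statistics match the closed forms for the integers 0..n-1. The sketch below reproduces the reported sum, mean, variance, and standard deviation. It assumes the report uses the sample variance (n-1 denominator), which is consistent with the reported value.

```python
import math

n = 81412  # number of observations; admission_id takes each value 0..n-1 once

total = n * (n - 1) // 2       # sum of the integers 0..n-1
mean = (n - 1) / 2             # midpoint of the range
variance = n * (n + 1) / 12    # sample variance of 0..n-1
std = math.sqrt(variance)
cv = std / mean                # coefficient of variation

print(total, mean, round(variance, 1), round(cv, 4))
```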

patient_id
Real number (ℝ≥0)

Distinct60069
Distinct (%)73.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean108639531.1
Minimum198
Maximum379005166
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum198
5-th percentile2914233.3
Q146839060
median90834372
Q3175111722
95-th percentile222996661.2
Maximum379005166
Range379004968
Interquartile range (IQR)128272662

Descriptive statistics

Standard deviation77324533.16
Coefficient of variation (CV)0.7117531928
Kurtosis-0.3648068986
Mean108639531.1
Median Absolute Deviation (MAD)65849112
Skewness0.4666683655
Sum8.844561509 × 10^12
Variance5.979083428 × 10^15
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
177571710 | 33 | < 0.1%
176455008 | 20 | < 0.1%
181219536 | 19 | < 0.1%
185418630 | 18 | < 0.1%
47286738 | 18 | < 0.1%
3320514 | 18 | < 0.1%
182320488 | 17 | < 0.1%
46397970 | 17 | < 0.1%
86281740 | 16 | < 0.1%
168857154 | 16 | < 0.1%
Other values (60059) | 81220 | 99.8%

Value | Count | Frequency (%)
198 | 2 | < 0.1%
684 | 1 | < 0.1%
1386 | 1 | < 0.1%
1476 | 1 | < 0.1%
1782 | 1 | < 0.1%
2232 | 4 | < 0.1%
2538 | 1 | < 0.1%
2556 | 3 | < 0.1%
3186 | 1 | < 0.1%
3978 | 1 | < 0.1%

Value | Count | Frequency (%)
379005166 | 1 | < 0.1%
378702118 | 1 | < 0.1%
378698788 | 1 | < 0.1%
378664102 | 1 | < 0.1%
378515620 | 1 | < 0.1%
378431452 | 1 | < 0.1%
378358570 | 1 | < 0.1%
378258634 | 1 | < 0.1%
378194590 | 1 | < 0.1%
378151156 | 1 | < 0.1%

race
Categorical

MISSING

Distinct5
Distinct (%)< 0.1%
Missing1827
Missing (%)2.2%
Memory size79.8 KiB
white: 60873
black: 15388
hispanic: 1627
other: 1180
asian: 517

Length

Max length8
Median length5
Mean length5.061330653
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowwhite
2nd rowwhite
3rd rowwhite
4th rowblack
5th rowwhite

Common Values

Value | Count | Frequency (%)
white | 60873 | 74.8%
black | 15388 | 18.9%
hispanic | 1627 | 2.0%
other | 1180 | 1.4%
asian | 517 | 0.6%
(Missing) | 1827 | 2.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
white | 60873 | 76.5%
black | 15388 | 19.3%
hispanic | 1627 | 2.0%
other | 1180 | 1.5%
asian | 517 | 0.6%
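
The two race frequency tables give different percentages for the same counts (for example 74.8% and 76.5% for white). The figures are consistent with the first table dividing by all 81412 observations and the second dividing only by rows with a recorded value. This is inferred from the arithmetic, not stated in the report. A minimal check:

```python
n_observations = 81412
missing = 1827

# Category counts copied from the race tables above.
counts = {
    "white": 60873,
    "black": 15388,
    "hispanic": 1627,
    "other": 1180,
    "asian": 517,
}

# Rows with a recorded race value.
n_present = sum(counts.values())

# Percentages with missing rows included in the denominator.
share_of_all = {k: round(100 * v / n_observations, 1) for k, v in counts.items()}

# Percentages with missing rows excluded from the denominator.
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

print(share_of_all)
print(share_of_present)
```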

gender
Categorical

Distinct2
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size79.8 KiB
female
43719 
male
37691 

Length

Max length6
Median length6
Mean length5.074044958
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmale
2nd rowmale
3rd rowfemale
4th rowfemale
5th rowfemale

Common Values

ValueCountFrequency (%)
female43719
53.7%
male37691
46.3%
(Missing)2
 
< 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
female43719
53.7%
male37691
46.3%

age
Real number (ℝ≥0)

MISSING

Distinct10
Distinct (%)< 0.1%
Missing2336
Missing (%)2.9%
Infinite0
Infinite (%)0.0%
Mean60.9550306
Minimum0
Maximum90
Zeros136
Zeros (%)0.2%
Negative0
Negative (%)0.0%
Memory size715.7 KiB

Quantile statistics

Minimum0
5-th percentile30
Q150
median60
Q370
95-th percentile80
Maximum90
Range90
Interquartile range (IQR)20

Descriptive statistics

Standard deviation15.97400516
Coefficient of variation (CV)0.2620621301
Kurtosis0.2920675104
Mean60.9550306
Median Absolute Deviation (MAD)10
Skewness-0.6355593598
Sum4820080
Variance255.1688408
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
70 | 20261 | 24.9%
60 | 17414 | 21.4%
50 | 13414 | 16.5%
80 | 13383 | 16.4%
40 | 7498 | 9.2%
30 | 2964 | 3.6%
90 | 2172 | 2.7%
20 | 1297 | 1.6%
10 | 537 | 0.7%
0 | 136 | 0.2%
(Missing) | 2336 | 2.9%

Value | Count | Frequency (%)
0 | 136 | 0.2%
10 | 537 | 0.7%
20 | 1297 | 1.6%
30 | 2964 | 3.6%
40 | 7498 | 9.2%
50 | 13414 | 16.5%
60 | 17414 | 21.4%
70 | 20261 | 24.9%
80 | 13383 | 16.4%
90 | 2172 | 2.7%

Value | Count | Frequency (%)
90 | 2172 | 2.7%
80 | 13383 | 16.4%
70 | 20261 | 24.9%
60 | 17414 | 21.4%
50 | 13414 | 16.5%
40 | 7498 | 9.2%
30 | 2964 | 3.6%
20 | 1297 | 1.6%
10 | 537 | 0.7%
0 | 136 | 0.2%
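
The reported mean and coefficient of variation for age are consistent with missing rows being excluded from the calculation: the sum divided by the 79076 non-missing rows gives the reported mean. A short check using only figures from this section:

```python
n_observations = 81412
missing = 2336
age_sum = 4820080      # "Sum" from the descriptive statistics
std = 15.97400516      # "Standard deviation" from the descriptive statistics

# Only rows with a recorded age contribute to the mean.
n_present = n_observations - missing
mean = age_sum / n_present

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n_present, round(mean, 4), round(cv, 4))
```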

weight
Real number (ℝ≥0)

MISSING

Distinct9
Distinct (%)0.4%
Missing78913
Missing (%)96.9%
Infinite0
Infinite (%)0.0%
Mean73.66946779
Minimum0
Maximum200
Zeros40
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size715.7 KiB

Quantile statistics

Minimum0
5-th percentile50
Q150
median75
Q3100
95-th percentile125
Maximum200
Range200
Interquartile range (IQR)50

Descriptive statistics

Standard deviation26.23034153
Coefficient of variation (CV)0.3560544459
Kurtosis1.55183219
Mean73.66946779
Median Absolute Deviation (MAD)25
Skewness0.3830941591
Sum184100
Variance688.0308168
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
75 | 1037 | 1.3%
50 | 713 | 0.9%
100 | 482 | 0.6%
125 | 117 | 0.1%
25 | 72 | 0.1%
0 | 40 | < 0.1%
150 | 27 | < 0.1%
175 | 8 | < 0.1%
200 | 3 | < 0.1%
(Missing) | 78913 | 96.9%

Value | Count | Frequency (%)
0 | 40 | < 0.1%
25 | 72 | 0.1%
50 | 713 | 0.9%
75 | 1037 | 1.3%
100 | 482 | 0.6%
125 | 117 | 0.1%
150 | 27 | < 0.1%
175 | 8 | < 0.1%
200 | 3 | < 0.1%

Value | Count | Frequency (%)
200 | 3 | < 0.1%
175 | 8 | < 0.1%
150 | 27 | < 0.1%
125 | 117 | 0.1%
100 | 482 | 0.6%
75 | 1037 | 1.3%
50 | 713 | 0.9%
25 | 72 | 0.1%
0 | 40 | < 0.1%

admission_type_code
Categorical

MISSING

Distinct5
Distinct (%)< 0.1%
Missing9366
Missing (%)11.5%
Memory size79.8 KiB
Emergency
42562 
Elective
14884 
Urgent
14576 
Trauma Center
 
16
Newborn
 
8

Length

Max length13
Median length9
Mean length8.187130444
Min length6

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowElective
2nd rowUrgent
3rd rowEmergency
4th rowEmergency
5th rowUrgent

Common Values

ValueCountFrequency (%)
Emergency42562
52.3%
Elective14884
 
18.3%
Urgent14576
 
17.9%
Trauma Center16
 
< 0.1%
Newborn8
 
< 0.1%
(Missing)9366
 
11.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
emergency42562
59.1%
elective14884
 
20.7%
urgent14576
 
20.2%
trauma16
 
< 0.1%
center16
 
< 0.1%
newborn8
 
< 0.1%

discharge_disposition_code
Categorical

MISSING

Distinct23
Distinct (%)< 0.1%
Missing4276
Missing (%)5.3%
Memory size80.4 KiB
Discharged to home
47854 
Discharged/transferred to SNF
11097 
Discharged/transferred to home with home health service
10244 
Discharged/transferred to another short term hospital
 
1690
Discharged/transferred to another rehab fac including rehab units of a hospital
 
1580
Other values (18)
 
4671

Length

Max length105
Median length18
Mean length27.38819747
Min length7

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDischarged to home
2nd rowDischarged to home
3rd rowDischarged to home
4th rowDischarged to home
5th rowDischarged/transferred to another rehab fac including rehab units of a hospital

Common Values

ValueCountFrequency (%)
Discharged to home47854
58.8%
Discharged/transferred to SNF11097
 
13.6%
Discharged/transferred to home with home health service10244
 
12.6%
Discharged/transferred to another short term hospital1690
 
2.1%
Discharged/transferred to another rehab fac including rehab units of a hospital1580
 
1.9%
Expired1312
 
1.6%
Discharged/transferred to another type of inpatient care institution936
 
1.1%
Discharged/transferred to ICF650
 
0.8%
Left AMA506
 
0.6%
Discharged/transferred to a long term care hospital324
 
0.4%
Other values (13)943
 
1.2%
(Missing)4276
 
5.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
to74698
25.3%
home68831
23.3%
discharged47859
16.2%
discharged/transferred26701
 
9.0%
snf11097
 
3.8%
health10248
 
3.5%
with10244
 
3.5%
service10244
 
3.5%
another4218
 
1.4%
hospital3829
 
1.3%
Other values (57)27750
 
9.4%

admission_source_code
Categorical

MISSING

Distinct14
Distinct (%)< 0.1%
Missing5641
Missing (%)6.9%
Memory size80.3 KiB
Emergency Room
45942 
Physician Referral
23684 
Transfer from a hospital
 
2581
Transfer from another health care facility
 
1819
Clinic Referral
 
878
Other values (9)
 
867

Length

Max length57
Median length14
Mean length16.56794816
Min length9

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowPhysician Referral
2nd rowEmergency Room
3rd rowEmergency Room
4th rowEmergency Room
5th rowPhysician Referral

Common Values

ValueCountFrequency (%)
Emergency Room45942
56.4%
Physician Referral23684
29.1%
Transfer from a hospital2581
 
3.2%
Transfer from another health care facility1819
 
2.2%
Clinic Referral878
 
1.1%
Transfer from a Skilled Nursing Facility (SNF)681
 
0.8%
HMO Referral148
 
0.2%
Court/Law Enforcement15
 
< 0.1%
Transfer from hospital inpt/same fac reslt in a sep claim10
 
< 0.1%
Transfer from critial access hospital6
 
< 0.1%
Other values (4)7
 
< 0.1%
(Missing)5641
 
6.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
emergency45942
27.4%
room45942
27.4%
referral24710
14.8%
physician23684
14.1%
transfer5099
 
3.0%
from5099
 
3.0%
a3272
 
2.0%
hospital2597
 
1.6%
facility2500
 
1.5%
care1819
 
1.1%
Other values (26)6825
 
4.1%

time_in_hospital
Real number (ℝ≥0)

Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.395924434
Minimum1
Maximum14
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median4
Q36
95-th percentile11
Maximum14
Range13
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.975844099
Coefficient of variation (CV)0.6769552444
Kurtosis0.8490950317
Mean4.395924434
Median Absolute Deviation (MAD)2
Skewness1.130728188
Sum357881
Variance8.855648103
MonotonicityNot monotonic
Histogram with fixed size bins (bins=14)

Value | Count | Frequency (%)
3 | 14223 | 17.5%
2 | 13723 | 16.9%
1 | 11302 | 13.9%
4 | 11228 | 13.8%
5 | 7975 | 9.8%
6 | 6059 | 7.4%
7 | 4717 | 5.8%
8 | 3528 | 4.3%
9 | 2372 | 2.9%
10 | 1886 | 2.3%
Other values (4) | 4399 | 5.4%

Value | Count | Frequency (%)
1 | 11302 | 13.9%
2 | 13723 | 16.9%
3 | 14223 | 17.5%
4 | 11228 | 13.8%
5 | 7975 | 9.8%
6 | 6059 | 7.4%
7 | 4717 | 5.8%
8 | 3528 | 4.3%
9 | 2372 | 2.9%
10 | 1886 | 2.3%

Value | Count | Frequency (%)
14 | 814 | 1.0%
13 | 944 | 1.2%
12 | 1153 | 1.4%
11 | 1488 | 1.8%
10 | 1886 | 2.3%
9 | 2372 | 2.9%
8 | 3528 | 4.3%
7 | 4717 | 5.8%
6 | 6059 | 7.4%
5 | 7975 | 9.8%

payer_code
Categorical

MISSING

Distinct17
Distinct (%)< 0.1%
Missing32278
Missing (%)39.6%
Memory size80.3 KiB
MC
25952 
HM
4973 
SP
4003 
BC
3718 
MD
2824 
Other values (12)
7664 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowMC
2nd rowMD
3rd rowMC
4th rowHM
5th rowBC

Common Values

ValueCountFrequency (%)
MC25952
31.9%
HM4973
 
6.1%
SP4003
 
4.9%
BC3718
 
4.6%
MD2824
 
3.5%
CP2053
 
2.5%
UN1925
 
2.4%
CM1551
 
1.9%
OG795
 
1.0%
PO469
 
0.6%
Other values (7)871
 
1.1%
(Missing)32278
39.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
mc25952
52.8%
hm4973
 
10.1%
sp4003
 
8.1%
bc3718
 
7.6%
md2824
 
5.7%
cp2053
 
4.2%
un1925
 
3.9%
cm1551
 
3.2%
og795
 
1.6%
po469
 
1.0%
Other values (7)871
 
1.8%

medical_specialty
Categorical

MISSING

Distinct24
Distinct (%)0.1%
Missing40020
Missing (%)49.2%
Memory size80.4 KiB
InternalMedicine
11712 
Emergency/Trauma
6021 
Family/GeneralPractice
5939 
Cardiology
4279 
Surgery
4082 
Other values (19)
9359 

Length

Max length22
Median length16
Mean length14.24966177
Min length6

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEmergency/Trauma
2nd rowInternalMedicine
3rd rowFamily/GeneralPractice
4th rowEmergency/Trauma
5th rowEmergency/Trauma

Common Values

ValueCountFrequency (%)
InternalMedicine11712
 
14.4%
Emergency/Trauma6021
 
7.4%
Family/GeneralPractice5939
 
7.3%
Cardiology4279
 
5.3%
Surgery4082
 
5.0%
Orthopedics2081
 
2.6%
Nephrology1299
 
1.6%
Radiology955
 
1.2%
Psychiatry758
 
0.9%
Pulmonology700
 
0.9%
Other values (14)3566
 
4.4%
(Missing)40020
49.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
internalmedicine11712
28.3%
emergency/trauma6021
14.5%
family/generalpractice5939
14.3%
cardiology4279
 
10.3%
surgery4082
 
9.9%
orthopedics2081
 
5.0%
nephrology1299
 
3.1%
radiology955
 
2.3%
psychiatry758
 
1.8%
pulmonology700
 
1.7%
Other values (14)3566
 
8.6%

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size79.6 KiB
False
80550 
True
 
862
ValueCountFrequency (%)
False80550
98.9%
True862
 
1.1%
Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size79.8 KiB
complete
67114 
incomplete
13978 
none
 
320

Length

Max length10
Median length8
Mean length8.327666683
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcomplete
2nd rowcomplete
3rd rowcomplete
4th rowcomplete
5th rowcomplete

Common Values

ValueCountFrequency (%)
complete67114
82.4%
incomplete13978
 
17.2%
none320
 
0.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
complete67114
82.4%
incomplete13978
 
17.2%
none320
 
0.4%

num_lab_procedures
Real number (ℝ≥0)

MISSING

Distinct115
Distinct (%)0.1%
Missing1493
Missing (%)1.8%
Infinite0
Infinite (%)0.0%
Mean43.07119709
Minimum1
Maximum132
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum1
5-th percentile4
Q132
median44
Q357
95-th percentile73
Maximum132
Range131
Interquartile range (IQR)25

Descriptive statistics

Standard deviation19.63040493
Coefficient of variation (CV)0.4557664114
Kurtosis-0.2441755168
Mean43.07119709
Median Absolute Deviation (MAD)13
Skewness-0.2404533528
Sum3442207
Variance385.3527977
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 2533 | 3.1%
43 | 2207 | 2.7%
44 | 1988 | 2.4%
45 | 1884 | 2.3%
38 | 1764 | 2.2%
40 | 1736 | 2.1%
46 | 1720 | 2.1%
41 | 1718 | 2.1%
47 | 1679 | 2.1%
39 | 1662 | 2.0%
Other values (105) | 61028 | 75.0%

Value | Count | Frequency (%)
1 | 2533 | 3.1%
2 | 858 | 1.1%
3 | 523 | 0.6%
4 | 285 | 0.4%
5 | 231 | 0.3%
6 | 214 | 0.3%
7 | 258 | 0.3%
8 | 281 | 0.3%
9 | 728 | 0.9%
10 | 655 | 0.8%

Value | Count | Frequency (%)
132 | 1 | < 0.1%
126 | 1 | < 0.1%
121 | 1 | < 0.1%
118 | 1 | < 0.1%
114 | 2 | < 0.1%
113 | 1 | < 0.1%
111 | 2 | < 0.1%
109 | 3 | < 0.1%
108 | 3 | < 0.1%
106 | 4 | < 0.1%

num_procedures
Real number (ℝ≥0)

ZEROS

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.341767798
Minimum0
Maximum6
Zeros37355
Zeros (%)45.9%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q32
95-th percentile5
Maximum6
Range6
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.708464781
Coefficient of variation (CV)1.273293921
Kurtosis0.8445664498
Mean1.341767798
Median Absolute Deviation (MAD)1
Skewness1.313355194
Sum109236
Variance2.918851909
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
0 | 37355 | 45.9%
1 | 16513 | 20.3%
2 | 10162 | 12.5%
3 | 7548 | 9.3%
6 | 3994 | 4.9%
4 | 3409 | 4.2%
5 | 2431 | 3.0%

Value | Count | Frequency (%)
0 | 37355 | 45.9%
1 | 16513 | 20.3%
2 | 10162 | 12.5%
3 | 7548 | 9.3%
4 | 3409 | 4.2%
5 | 2431 | 3.0%
6 | 3994 | 4.9%

Value | Count | Frequency (%)
6 | 3994 | 4.9%
5 | 2431 | 3.0%
4 | 3409 | 4.2%
3 | 7548 | 9.3%
2 | 10162 | 12.5%
1 | 16513 | 20.3%
0 | 37355 | 45.9%
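
Since num_procedures has only seven distinct values and no missing rows, its sum, mean, and median can be recovered from the frequency table alone. The sketch below does this with a small helper; `value_at_rank` is a hypothetical name introduced for illustration.

```python
# Value -> count, copied from the num_procedures frequency table.
counts = {0: 37355, 1: 16513, 2: 10162, 3: 7548, 4: 3409, 5: 2431, 6: 3994}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n


def value_at_rank(counts, rank):
    """Return the value at 0-based position `rank` in the sorted data."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if rank < seen:
            return value
    raise IndexError("rank out of range")


# Average the two middle positions; they coincide when n is odd.
median = (value_at_rank(counts, (n - 1) // 2) + value_at_rank(counts, n // 2)) / 2

print(n, total, round(mean, 4), median)
```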

num_medications
Real number (ℝ≥0)

MISSING

Distinct73
Distinct (%)0.1%
Missing2678
Missing (%)3.3%
Infinite0
Infinite (%)0.0%
Mean16.02442401
Minimum1
Maximum81
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum1
5-th percentile6
Q110
median15
Q320
95-th percentile31
Maximum81
Range80
Interquartile range (IQR)10

Descriptive statistics

Standard deviation8.107234783
Coefficient of variation (CV)0.5059298717
Kurtosis3.427614554
Mean16.02442401
Median Absolute Deviation (MAD)5
Skewness1.316139858
Sum1261667
Variance65.72725582
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
13 | 4714 | 5.8%
12 | 4636 | 5.7%
11 | 4457 | 5.5%
15 | 4443 | 5.5%
14 | 4436 | 5.4%
16 | 4234 | 5.2%
10 | 4108 | 5.0%
17 | 3817 | 4.7%
9 | 3813 | 4.7%
18 | 3523 | 4.3%
Other values (63) | 36553 | 44.9%

Value | Count | Frequency (%)
1 | 203 | 0.2%
2 | 357 | 0.4%
3 | 687 | 0.8%
4 | 1107 | 1.4%
5 | 1570 | 1.9%
6 | 2094 | 2.6%
7 | 2662 | 3.3%
8 | 3356 | 4.1%
9 | 3813 | 4.7%
10 | 4108 | 5.0%

Value | Count | Frequency (%)
81 | 1 | < 0.1%
75 | 2 | < 0.1%
74 | 1 | < 0.1%
70 | 2 | < 0.1%
69 | 5 | < 0.1%
68 | 6 | < 0.1%
67 | 6 | < 0.1%
66 | 3 | < 0.1%
65 | 8 | < 0.1%
64 | 7 | < 0.1%

number_outpatient
Real number (ℝ≥0)

ZEROS

Distinct39
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.3709526851
Minimum0
Maximum42
Zeros67984
Zeros (%)83.5%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile2
Maximum42
Range42
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.278537861
Coefficient of variation (CV)3.446633256
Kurtosis152.4729964
Mean0.3709526851
Median Absolute Deviation (MAD)0
Skewness8.984331589
Sum30200
Variance1.634659062
MonotonicityNot monotonic
Histogram with fixed size bins (bins=39)

Value | Count | Frequency (%)
0 | 67984 | 83.5%
1 | 6862 | 8.4%
2 | 2904 | 3.6%
3 | 1618 | 2.0%
4 | 873 | 1.1%
5 | 420 | 0.5%
6 | 238 | 0.3%
7 | 129 | 0.2%
8 | 76 | 0.1%
9 | 73 | 0.1%
Other values (29) | 235 | 0.3%

Value | Count | Frequency (%)
0 | 67984 | 83.5%
1 | 6862 | 8.4%
2 | 2904 | 3.6%
3 | 1618 | 2.0%
4 | 873 | 1.1%
5 | 420 | 0.5%
6 | 238 | 0.3%
7 | 129 | 0.2%
8 | 76 | 0.1%
9 | 73 | 0.1%

Value | Count | Frequency (%)
42 | 1 | < 0.1%
40 | 1 | < 0.1%
39 | 1 | < 0.1%
38 | 1 | < 0.1%
37 | 1 | < 0.1%
36 | 1 | < 0.1%
35 | 1 | < 0.1%
34 | 1 | < 0.1%
33 | 2 | < 0.1%
29 | 2 | < 0.1%

number_emergency
Real number (ℝ≥0)

ZEROS

Distinct29
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1975875792
Minimum0
Maximum64
Zeros72350
Zeros (%)88.9%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum64
Range64
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.8812896989
Coefficient of variation (CV)4.460248475
Kurtosis638.5493694
Mean0.1975875792
Median Absolute Deviation (MAD)0
Skewness16.40885167
Sum16086
Variance0.7766715333
MonotonicityNot monotonic
Histogram with fixed size bins (bins=29)

Value | Count | Frequency (%)
0 | 72350 | 88.9%
1 | 6064 | 7.4%
2 | 1646 | 2.0%
3 | 588 | 0.7%
4 | 292 | 0.4%
5 | 167 | 0.2%
6 | 82 | 0.1%
7 | 55 | 0.1%
8 | 39 | < 0.1%
9 | 27 | < 0.1%
Other values (19) | 102 | 0.1%

Value | Count | Frequency (%)
0 | 72350 | 88.9%
1 | 6064 | 7.4%
2 | 1646 | 2.0%
3 | 588 | 0.7%
4 | 292 | 0.4%
5 | 167 | 0.2%
6 | 82 | 0.1%
7 | 55 | 0.1%
8 | 39 | < 0.1%
9 | 27 | < 0.1%

Value | Count | Frequency (%)
64 | 1 | < 0.1%
46 | 1 | < 0.1%
42 | 1 | < 0.1%
29 | 1 | < 0.1%
28 | 1 | < 0.1%
25 | 2 | < 0.1%
24 | 1 | < 0.1%
22 | 4 | < 0.1%
21 | 2 | < 0.1%
20 | 4 | < 0.1%

number_inpatient
Real number (ℝ≥0)

ZEROS

Distinct21
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.6377929544
Minimum0
Maximum21
Zeros53995
Zeros (%)66.3%
Negative0
Negative (%)0.0%
Memory size636.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum21
Range21
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.265472414
Coefficient of variation (CV)1.984142981
Kurtosis20.83213029
Mean0.6377929544
Median Absolute Deviation (MAD)0
Skewness3.619760625
Sum51924
Variance1.60142043
MonotonicityNot monotonic
Histogram with fixed size bins (bins=21)

Value | Count | Frequency (%)
0 | 53995 | 66.3%
1 | 15706 | 19.3%
2 | 6057 | 7.4%
3 | 2747 | 3.4%
4 | 1288 | 1.6%
5 | 641 | 0.8%
6 | 386 | 0.5%
7 | 224 | 0.3%
8 | 120 | 0.1%
9 | 93 | 0.1%
Other values (11) | 155 | 0.2%

Value | Count | Frequency (%)
0 | 53995 | 66.3%
1 | 15706 | 19.3%
2 | 6057 | 7.4%
3 | 2747 | 3.4%
4 | 1288 | 1.6%
5 | 641 | 0.8%
6 | 386 | 0.5%
7 | 224 | 0.3%
8 | 120 | 0.1%
9 | 93 | 0.1%

Value | Count | Frequency (%)
21 | 1 | < 0.1%
19 | 2 | < 0.1%
18 | 1 | < 0.1%
17 | 1 | < 0.1%
16 | 4 | < 0.1%
15 | 7 | < 0.1%
14 | 6 | < 0.1%
13 | 16 | < 0.1%
12 | 30 | < 0.1%
11 | 40 | < 0.1%

diag_1
Categorical

Distinct17
Distinct (%)< 0.1%
Missing16
Missing (%)< 0.1%
Memory size80.3 KiB
diseases of the circulatory system
24193 
endocrine, nutritional and metabolic diseases, and immunity disorders
9257 
diseases of the respiratory system
8316 
diseases of the digestive system
7368 
symptoms, signs, and ill-defined conditions
6137 
Other values (12)
26125 

Length

Max length69
Median length34
Mean length38.14739054
Min length9

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowdiseases of the musculoskeletal system and connective tissue
2nd rowendocrine, nutritional and metabolic diseases, and immunity disorders
3rd rowdiseases of the circulatory system
4th rowdiseases of the digestive system
5th rowdiseases of the digestive system

Common Values

ValueCountFrequency (%)
diseases of the circulatory system24193
29.7%
endocrine, nutritional and metabolic diseases, and immunity disorders9257
 
11.4%
diseases of the respiratory system8316
 
10.2%
diseases of the digestive system7368
 
9.1%
symptoms, signs, and ill-defined conditions6137
 
7.5%
injury and poisoning5593
 
6.9%
diseases of the genitourinary system4052
 
5.0%
diseases of the musculoskeletal system and connective tissue3956
 
4.9%
neoplasms2762
 
3.4%
infectious and parasitic diseases2200
 
2.7%
Other values (7)7562
 
9.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
diseases63191
14.9%
of53601
12.6%
the52291
12.3%
system48860
11.5%
and40806
 
9.6%
circulatory24193
 
5.7%
disorders11061
 
2.6%
nutritional9257
 
2.2%
metabolic9257
 
2.2%
immunity9257
 
2.2%
Other values (32)103350
24.3%

diag_2
Categorical

MISSING

Distinct: 17
Distinct (%): < 0.1%
Missing: 1620
Missing (%): 2.0%
Memory size: 80.3 KiB

diseases of the circulatory system: 24666
endocrine, nutritional and metabolic diseases, and immunity disorders: 16518
diseases of the respiratory system: 8072
diseases of the genitourinary system: 6294
symptoms, signs, and ill-defined conditions: 3616
Other values (12): 20626

Length

Max length: 69
Median length: 34
Mean length: 41.46834269
Min length: 9

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: endocrine, nutritional and metabolic diseases, and immunity disorders
2nd row: diseases of the respiratory system
3rd row: diseases of the respiratory system
4th row: diseases of the genitourinary system
5th row: diseases of the digestive system

Common Values

Value | Count | Frequency (%)
diseases of the circulatory system | 24666 | 30.3%
endocrine, nutritional and metabolic diseases, and immunity disorders | 16518 | 20.3%
diseases of the respiratory system | 8072 | 9.9%
diseases of the genitourinary system | 6294 | 7.7%
symptoms, signs, and ill-defined conditions | 3616 | 4.4%
diseases of the digestive system | 3154 | 3.9%
diseases of the skin and subcutaneous tissue | 2822 | 3.5%
diseases of the blood and blood-forming organs | 2307 | 2.8%
mental disorders | 2066 | 2.5%
external causes of injury | 2002 | 2.5%
Other values (7) | 8275 | 10.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
diseases | 67795 | 15.2%
of | 52070 | 11.7%
the | 50068 | 11.3%
and | 47988 | 10.8%
system | 44612 | 10.0%
circulatory | 24666 | 5.5%
disorders | 18584 | 4.2%
endocrine | 16518 | 3.7%
nutritional | 16518 | 3.7%
metabolic | 16518 | 3.7%
Other values (32) | 89609 | 20.1%

diag_3
Categorical

MISSING

Distinct: 17
Distinct (%): < 0.1%
Missing: 1133
Missing (%): 1.4%
Memory size: 80.3 KiB

diseases of the circulatory system: 23979
endocrine, nutritional and metabolic diseases, and immunity disorders: 20960
diseases of the respiratory system: 5447
diseases of the genitourinary system: 5065
external causes of injury: 4011
Other values (12): 20817

Length

Max length: 69
Median length: 34
Mean length: 43.17003201
Min length: 9

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: diseases of the nervous system and sense organs
2nd row: neoplasms
3rd row: endocrine, nutritional and metabolic diseases, and immunity disorders
4th row: endocrine, nutritional and metabolic diseases, and immunity disorders
5th row: diseases of the digestive system

Common Values

Value | Count | Frequency (%)
diseases of the circulatory system | 23979 | 29.5%
endocrine, nutritional and metabolic diseases, and immunity disorders | 20960 | 25.7%
diseases of the respiratory system | 5447 | 6.7%
diseases of the genitourinary system | 5065 | 6.2%
external causes of injury | 4011 | 4.9%
symptoms, signs, and ill-defined conditions | 3611 | 4.4%
diseases of the digestive system | 2890 | 3.5%
mental disorders | 2505 | 3.1%
diseases of the skin and subcutaneous tissue | 2016 | 2.5%
diseases of the blood and blood-forming organs | 1986 | 2.4%
Other values (7) | 7809 | 9.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
diseases | 66797 | 14.5%
and | 55780 | 12.1%
of | 48597 | 10.6%
the | 44586 | 9.7%
system | 40336 | 8.8%
circulatory | 23979 | 5.2%
disorders | 23465 | 5.1%
endocrine | 20960 | 4.6%
nutritional | 20960 | 4.6%
metabolic | 20960 | 4.6%
Other values (32) | 92929 | 20.2%

number_diagnoses
Real number (ℝ≥0)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.421964821
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.931480463
Coefficient of variation (CV): 0.2602384287
Kurtosis: -0.07655270384
Mean: 7.421964821
Median Absolute Deviation (MAD): 1
Skewness: -0.8713996895
Sum: 604237
Variance: 3.730616778
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Value | Count | Frequency (%)
9 | 39501 | 48.5%
5 | 9144 | 11.2%
8 | 8506 | 10.4%
7 | 8379 | 10.3%
6 | 8133 | 10.0%
4 | 4428 | 5.4%
3 | 2238 | 2.7%
2 | 824 | 1.0%
1 | 167 | 0.2%
16 | 38 | < 0.1%
Other values (6) | 54 | 0.1%

Value | Count | Frequency (%)
1 | 167 | 0.2%
2 | 824 | 1.0%
3 | 2238 | 2.7%
4 | 4428 | 5.4%
5 | 9144 | 11.2%
6 | 8133 | 10.0%
7 | 8379 | 10.3%
8 | 8506 | 10.4%
9 | 39501 | 48.5%
10 | 14 | < 0.1%

Value | Count | Frequency (%)
16 | 38 | < 0.1%
15 | 8 | < 0.1%
14 | 5 | < 0.1%
13 | 14 | < 0.1%
12 | 5 | < 0.1%
11 | 8 | < 0.1%
10 | 14 | < 0.1%
9 | 39501 | 48.5%
8 | 8506 | 10.4%
7 | 8379 | 10.3%

blood_type
Categorical

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.0 KiB

O+: 32053
A+: 24744
B+: 9218
O-: 5689
A-: 4826
Other values (3): 4882

Length

Max length: 3
Median length: 2
Mean length: 2.041738319
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A+
2nd row: B+
3rd row: O+
4th row: AB-
5th row: A+

Common Values

Value | Count | Frequency (%)
O+ | 32053 | 39.4%
A+ | 24744 | 30.4%
B+ | 9218 | 11.3%
O- | 5689 | 7.0%
A- | 4826 | 5.9%
AB+ | 2619 | 3.2%
B- | 1484 | 1.8%
AB- | 779 | 1.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
o | 37742 | 46.4%
a | 29570 | 36.3%
b | 10702 | 13.1%
ab | 3398 | 4.2%

hemoglobin_level
Real number (ℝ≥0)

Distinct: 77
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.19232791
Minimum: 10.5
Maximum: 18.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.2 KiB

Quantile statistics

Minimum: 10.5
5-th percentile: 12.6
Q1: 13.4
median: 14.1
Q3: 15
95-th percentile: 16
Maximum: 18.6
Range: 8.1
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.059999933
Coefficient of variation (CV): 0.07468823574
Kurtosis: -0.4492654524
Mean: 14.19232791
Median Absolute Deviation (MAD): 0.8
Skewness: 0.1878605037
Sum: 1155425.8
Variance: 1.123599858
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
13.6 | 2964 | 3.6%
13.9 | 2911 | 3.6%
13.7 | 2876 | 3.5%
13.8 | 2841 | 3.5%
13.5 | 2796 | 3.4%
14.1 | 2745 | 3.4%
14 | 2730 | 3.4%
14.2 | 2657 | 3.3%
13.4 | 2654 | 3.3%
13.3 | 2643 | 3.2%
Other values (67) | 53595 | 65.8%

Value | Count | Frequency (%)
10.5 | 2 | < 0.1%
10.8 | 2 | < 0.1%
10.9 | 3 | < 0.1%
11 | 5 | < 0.1%
11.1 | 3 | < 0.1%
11.2 | 18 | < 0.1%
11.3 | 17 | < 0.1%
11.4 | 33 | < 0.1%
11.5 | 47 | 0.1%
11.6 | 71 | 0.1%

Value | Count | Frequency (%)
18.6 | 1 | < 0.1%
18.2 | 2 | < 0.1%
18.1 | 2 | < 0.1%
18 | 1 | < 0.1%
17.9 | 2 | < 0.1%
17.8 | 2 | < 0.1%
17.7 | 2 | < 0.1%
17.6 | 7 | < 0.1%
17.5 | 16 | < 0.1%
17.4 | 32 | < 0.1%

blood_transfusion
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.6 KiB

Value | Count | Frequency (%)
False | 71697 | 88.1%
True | 9715 | 11.9%

max_glu_serum
Categorical

MISSING

Distinct: 3
Distinct (%): 0.1%
Missing: 77159
Missing (%): 94.8%
Memory size: 79.8 KiB

normal: 2049
>200: 1179
>300: 1025

Length

Max length: 6
Median length: 4
Mean length: 4.963555138
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >300
2nd row: normal
3rd row: >300
4th row: >300
5th row: >300

Common Values

Value | Count | Frequency (%)
normal | 2049 | 2.5%
>200 | 1179 | 1.4%
>300 | 1025 | 1.3%
(Missing) | 77159 | 94.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
normal | 2049 | 48.2%
200 | 1179 | 27.7%
300 | 1025 | 24.1%

A1Cresult
Categorical

MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 67807
Missing (%): 83.3%
Memory size: 79.8 KiB

>8: 6547
normal: 4003
>7: 3055

Length

Max length: 6
Median length: 2
Mean length: 3.17692025
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >7
2nd row: >8
3rd row: >7
4th row: >7
5th row: >7

Common Values

Value | Count | Frequency (%)
>8 | 6547 | 8.0%
normal | 4003 | 4.9%
>7 | 3055 | 3.8%
(Missing) | 67807 | 83.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
8 | 6547 | 48.1%
normal | 4003 | 29.4%
7 | 3055 | 22.5%

diuretics
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.6 KiB

Value | Count | Frequency (%)
False | 79893 | 98.1%
True | 1519 | 1.9%

insulin
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.6 KiB

Value | Count | Frequency (%)
True | 44360 | 54.5%
False | 37052 | 45.5%

change
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.6 KiB

Value | Count | Frequency (%)
False | 43772 | 53.8%
True | 37640 | 46.2%

diabetesMed
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.6 KiB

Value | Count | Frequency (%)
True | 62718 | 77.0%
False | 18694 | 23.0%

readmitted
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.6 KiB

Value | Count | Frequency (%)
False | 72340 | 88.9%
True | 9072 | 11.1%

Interactions

[Figures: 13 × 13 grid of pairwise interaction plots between the numeric variables (169 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
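
The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator runs: the helper names `average_ranks`, `pearson`, and `spearman_rho` are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (y = x**3) still gives rho close to 1.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```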

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
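
As a rough illustration of the formula and of the invariance property, the following sketch computes r directly and then re-computes it after shifting and rescaling both variables. The function name `pearson_r` and the toy data are assumptions made for this example; the scale factors are taken to be positive, since a negative factor flips the sign of r.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

r_original = pearson_r(x, y)

# Change location and (positive) scale of each variable separately.
x_shifted = [2.0 * a + 5.0 for a in x]
y_shifted = [0.5 * b - 3.0 for b in y]
r_transformed = pearson_r(x_shifted, y_shifted)
```

Up to floating-point rounding, `r_original` and `r_transformed` agree, which is the invariance the paragraph above refers to.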

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
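
The pair-counting definition can be written out directly. The sketch below implements exactly that definition (often called tau-a); the function name `kendall_tau_a` is hypothetical. Statistical libraries commonly report a tie-adjusted variant (tau-b), so values may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether both variables move together.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie; it counts toward neither group.
    return (concordant - discordant) / len(pairs)


# Three pairs in total: two concordant, one discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```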

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
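
A minimal sketch of a bias-corrected Cramér's V follows, assuming the correction as it is commonly stated: the squared φ coefficient and the table dimensions are each adjusted by terms in 1/(n - 1). The function name `cramers_v_corrected` is hypothetical, and this is not necessarily the exact code path used to produce the report.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfectly associated labels give a value of (approximately) 1.
v_perfect = cramers_v_corrected(["x"] * 50 + ["y"] * 50, ["p"] * 50 + ["q"] * 50)
# Independent labels give 0.
v_independent = cramers_v_corrected(["x", "x", "y", "y"], ["p", "q", "p", "q"])
```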

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
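
The nullity correlation behind the heatmap can be illustrated with a small sketch, assuming it is computed as the Pearson correlation between the present/absent indicators of two columns. The function name `nullity_correlation` and the toy columns are hypothetical, and missing values are represented here by `None`.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the present/absent indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the indicator then has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns: the two fields are always missing together.
together = nullity_correlation([None, None, 70, 80], [None, None, ">200", "normal"])
# Toy columns: one field is present exactly when the other is missing.
opposite = nullity_correlation([None, None, 70, 80], ["MC", "MD", None, None])
```

A value near 1 means the two columns tend to be missing in the same rows, and a value near -1 means one tends to be present when the other is absent.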

Sample

First rows

 | admission_id | patient_id | race | gender | age | weight | admission_type_code | discharge_disposition_code | admission_source_code | time_in_hospital | payer_code | medical_specialty | has_prosthesis | complete_vaccination_status | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | diag_1 | diag_2 | diag_3 | number_diagnoses | blood_type | hemoglobin_level | blood_transfusion | max_glu_serum | A1Cresult | diuretics | insulin | change | diabetesMed | readmitted
0 | 0 | 199042938 | white | male | 50 | <NA> | Elective | Discharged to home | Physician Referral | 1 | NaN | NaN | False | complete | 24.0 | 3 | 9.0 | 0 | 0 | 0 | diseases of the musculoskeletal system and connective tissue | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the nervous system and sense organs | 5 | A+ | 14.5 | False | NaN | NaN | False | False | False | True | False
1 | 1 | 91962954 | white | male | 80 | <NA> | Urgent | Discharged to home | Emergency Room | 3 | MC | Emergency/Trauma | False | complete | 50.0 | 0 | 8.0 | 0 | 0 | 1 | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the respiratory system | neoplasms | 9 | B+ | 15.7 | False | NaN | >7 | False | False | False | False | True
2 | 2 | 109707084 | white | female | 60 | <NA> | Emergency | Discharged to home | Emergency Room | 5 | MD | NaN | False | complete | 43.0 | 6 | 28.0 | 0 | 0 | 0 | diseases of the circulatory system | diseases of the respiratory system | endocrine, nutritional and metabolic diseases, and immunity disorders | 6 | O+ | 13.0 | False | NaN | NaN | False | True | True | True | False
3 | 3 | 157495374 | black | female | 70 | <NA> | NaN | Discharged to home | NaN | 2 | NaN | NaN | False | complete | 58.0 | 5 | 8.0 | 0 | 1 | 1 | diseases of the digestive system | diseases of the genitourinary system | endocrine, nutritional and metabolic diseases, and immunity disorders | 9 | AB- | 13.5 | False | NaN | >8 | False | False | False | True | False
4 | 4 | 82692360 | white | female | <NA> | <NA> | Emergency | Discharged/transferred to another rehab fac including rehab units of a hospital | Emergency Room | 12 | MC | NaN | False | complete | 56.0 | 1 | 16.0 | 0 | 0 | 2 | diseases of the digestive system | diseases of the digestive system | diseases of the digestive system | 8 | A+ | 13.0 | False | NaN | NaN | False | False | False | False | False
5 | 5 | 218016576 | white | female | 70 | <NA> | Urgent | Discharged to home | Physician Referral | 4 | HM | NaN | False | incomplete | 14.0 | 3 | 13.0 | 0 | 0 | 0 | diseases of the digestive system | diseases of the blood and blood-forming organs | diseases of the digestive system | 9 | A+ | 13.1 | False | NaN | NaN | False | False | False | True | True
6 | 6 | 143084970 | white | male | 60 | <NA> | Emergency | Discharged to home | Emergency Room | 6 | BC | NaN | False | complete | 62.0 | 0 | 21.0 | 0 | 0 | 0 | diseases of the respiratory system | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | 9 | A- | 14.2 | False | NaN | >7 | False | True | False | True | False
7 | 7 | 227644092 | other | female | 70 | <NA> | Elective | Discharged to home | Physician Referral | 11 | NaN | InternalMedicine | False | incomplete | 18.0 | 4 | 9.0 | 1 | 0 | 1 | external causes of injury | diseases of the circulatory system | diseases of the genitourinary system | 6 | O+ | 12.9 | False | NaN | NaN | False | False | False | False | False
8 | 8 | 77740434 | white | female | 70 | <NA> | Emergency | Discharged/transferred to SNF | Emergency Room | 2 | MC | NaN | False | complete | 36.0 | 0 | 9.0 | 0 | 0 | 3 | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the skin and subcutaneous tissue | diseases of the respiratory system | 6 | O- | 13.9 | False | NaN | NaN | False | True | False | True | False
9 | 9 | 203123016 | white | female | 40 | <NA> | Elective | Discharged to home | Physician Referral | 2 | NaN | NaN | False | complete | 5.0 | 2 | 25.0 | 0 | 0 | 0 | diseases of the circulatory system | diseases of the nervous system and sense organs | diseases of the circulatory system | 9 | A+ | 13.2 | False | NaN | NaN | False | True | True | True | False

Last rows

(index) | admission_id | patient_id | race | gender | age | weight | admission_type_code | discharge_disposition_code | admission_source_code | time_in_hospital | payer_code | medical_specialty | has_prosthesis | complete_vaccination_status | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | diag_1 | diag_2 | diag_3 | number_diagnoses | blood_type | hemoglobin_level | blood_transfusion | max_glu_serum | A1Cresult | diuretics | insulin | change | diabetesMed | readmitted
81402 | 81402 | 155324556 | other | male | 50 | <NA> | Emergency | Discharged to home | Emergency Room | 3 | MD | NaN | False | incomplete | 1.0 | 0 | 13.0 | 0 | 0 | 0 | diseases of the skin and subcutaneous tissue | endocrine, nutritional and metabolic diseases, and immunity disorders | diseases of the skin and subcutaneous tissue | 9 | O+ | 16.7 | False | NaN | NaN | False | True | False | True | False
81403 | 81403 | 224478630 | black | female | 80 | <NA> | Elective | Discharged/transferred to home with home health service | Clinic Referral | 7 | NaN | NaN | False | complete | 35.0 | 2 | 37.0 | 0 | 2 | 0 | diseases of the circulatory system | diseases of the skin and subcutaneous tissue | diseases of the circulatory system | 4 | A+ | 13.8 | False | NaN | NaN | False | True | True | True | True
81404 | 81404 | 228104460 | black | female | 50 | <NA> | Elective | Discharged to home | Physician Referral | 3 | NaN | Gynecology/Obstetrics | False | complete | 8.0 | 2 | 12.0 | 0 | 0 | 0 | neoplasms | neoplasms | diseases of the genitourinary system | 9 | AB- | 12.8 | False | NaN | NaN | False | False | False | True | False
81405 | 81405 | 48436146 | black | female | 40 | <NA> | Emergency | Discharged to home | Emergency Room | 6 | MC | Nephrology | False | incomplete | 35.0 | 3 | 22.0 | 0 | 0 | 0 | diseases of the genitourinary system | diseases of the genitourinary system | external causes of injury | 9 | B+ | 11.4 | False | NaN | NaN | False | True | False | True | False
81406 | 81406 | 91915884 | white | female | 80 | <NA> | Emergency | Discharged/transferred to SNF | Emergency Room | 4 | MD | NaN | False | complete | 46.0 | 0 | 11.0 | 0 | 0 | 0 | diseases of the genitourinary system | endocrine, nutritional and metabolic diseases, and immunity disorders | endocrine, nutritional and metabolic diseases, and immunity disorders | 9 | A+ | 13.8 | False | NaN | NaN | False | True | False | True | False
81407 | 81407 | 80746578 | white | male | 60 | <NA> | Emergency | Discharged/transferred to a nursing facility certified under Medicaid but not certified under Medicare | Emergency Room | 2 | MD | NaN | False | complete | 64.0 | 0 | 22.0 | 0 | 0 | 3 | diseases of the respiratory system | diseases of the respiratory system | endocrine, nutritional and metabolic diseases, and immunity disorders | 5 | O+ | 15.4 | False | NaN | NaN | False | True | True | True | False
81408 | 81408 | 221853996 | black | female | 70 | <NA> | Emergency | Discharged/transferred to home with home health service | Emergency Room | 8 | CP | InternalMedicine | False | complete | 62.0 | 2 | 19.0 | 0 | 0 | 1 | diseases of the circulatory system | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | 9 | O+ | 12.8 | False | NaN | NaN | False | True | True | True | False
81409 | 81409 | 104846580 | black | female | 60 | <NA> | Emergency | Discharged/transferred to another rehab fac including rehab units of a hospital | Emergency Room | 5 | NaN | NaN | False | complete | 1.0 | 2 | 14.0 | 0 | 0 | 0 | injury and poisoning | diseases of the circulatory system | endocrine, nutritional and metabolic diseases, and immunity disorders | 9 | B+ | 13.0 | False | NaN | NaN | False | False | False | False | True
81410 | 81410 | 229820346 | white | female | 50 | <NA> | Elective | Discharged to home | Clinic Referral | 1 | NaN | NaN | False | complete | 30.0 | 1 | 10.0 | 0 | 0 | 0 | injury and poisoning | congenital anomalies | endocrine, nutritional and metabolic diseases, and immunity disorders | 7 | AB+ | 13.3 | True | NaN | NaN | False | False | False | True | False
81411 | 81411 | 49302180 | black | female | 40 | <NA> | Emergency | Discharged to home | Emergency Room | 3 | NaN | InternalMedicine | False | complete | 43.0 | 0 | 14.0 | 0 | 0 | 0 | endocrine, nutritional and metabolic diseases, and immunity disorders | neoplasms | external causes of injury | 4 | B+ | 12.4 | False | NaN | NaN | False | True | True | True | False